Yicheng Xiao

LongLive-2.0: An NVFP4 Parallel Infrastructure for Long Video Generation

May 19, 2026
Via arXiv

Awaking Spatial Intelligence in Unified Multimodal Understanding and Generation

May 05, 2026
Via arXiv

SpatialEdit: Benchmarking Fine-Grained Image Spatial Editing

Apr 06, 2026
Via arXiv

PixCLIP: Achieving Fine-grained Visual Language Understanding via Any-granularity Pixel-Text Alignment Learning

Nov 06, 2025
Via arXiv

LongLive: Real-time Interactive Long Video Generation

Sep 26, 2025
Via arXiv

Accelerating Parallel Diffusion Model Serving with Residual Compression

Jul 23, 2025
Via arXiv

LoRA-Gen: Specializing Large Language Model via Online LoRA Generation

Jun 13, 2025
Via arXiv

SAM-R1: Leveraging SAM for Reward Feedback in Multimodal Segmentation via Reinforcement Learning

May 28, 2025
Via arXiv

TensorAR: Refinement is All You Need in Autoregressive Image Generation

May 22, 2025
Via arXiv

MindOmni: Unleashing Reasoning Generation in Vision Language Models with RGPO

May 19, 2025
Via arXiv